Translation of sublanguages by subgrammars
Abstract
This paper discusses the performance of two data-driven translation methods on a very constrained sublanguage: dates. As a first result, we show that, for the translation of dates from Chinese into English, an example-based method is outperformed by a statistical method when small random training corpora are used; with either method, 750 random examples suffice to translate a corpus of 4,018 dates almost perfectly. As a second result, we prove that 58 dates theoretically suffice to translate the same corpus of 4,018 dates perfectly, and we verify this experimentally with an example-based method, whereas a statistical method fails to translate 345 of the 4,018 dates.
Related resources
SMT for restricted sublanguage in CAT tool context at the European Parliament
This paper shows that it is possible to efficiently develop Statistical Machine Translation (SMT) systems that are useful for a specific type of sublanguage in real context of use even when excluding the exact Translation Memory (TM) matches from the test set in order to be integrated in CAT "Computer Aided Translation" tools. It means that the included part is quite different from the existing...
The Organization Of The Rosetta Grammars
In this paper the organization of the grammars in the Rosetta machine translation system is described and it is shown how this organization makes it possible to translate between words of different syntactic categories in a systematic way. It is also shown how the organization chosen makes it possible to translate 'small clauses' into full clauses and vice versa. The central concept worked out ...
The Significance of Sublanguage for Automatic Translation
But when we consider the automatic translation of specialized language, we are forced to be more precise. We must describe sublanguages as coherent, rule-based systems. The attempt to write grammars for special-purpose sublanguages raises a number of theoretical and practical problems, which are only now being intensively discussed. But since the only path to high-quality automatic translation ...
Sublanguages In Machine Translation
There have been various attempts at using the sublanguage notion for disambiguation and the selection of target language equivalents in machine translation. In this paper a theoretical concept and its implementation in a real MT application are presented. Above this, means of linguistic engineering like weighting mechanisms are proposed.
Machine Learning of Language Translation Rules
The purpose of this paper is to present learning methods for creating language translation rules from multilingual text samples. The languages concerned are controlled languages, i.e. they are domain specific sublanguages with ambiguities eliminated by restricting the vocabulary and syntax. Learning methods presented here enable a supervised, human-assisted learning of generalised translation r...
Publication date: 2009